The essential role of time in network-based recommendation
Authors: Alexandre Vidmer, Matúš Medo
arXiv:1606.04666v1 [cs.IR], 15 Jun 2016
Abstract
Random walks on bipartite networks have been used extensively to design personalized recommendation methods. While aging has been identified as a key component in the growth of information networks, most research has focused on the networks' structural properties and neglected the time information that is often available. Time has been largely ignored both by the investigated recommendation methods and by the methodology used to evaluate them. We show that this time-unaware approach overestimates the methods' recommendation performance. Motivated by microscopic rules of network growth, we propose a time-aware modification of an existing recommendation method and show that, by combining temporal and structural aspects, it outperforms the existing methods. The performance improvements are particularly striking in systems with fast aging.
Introduction. – Increasing data availability and computational capacity [1], interconnections between previously separate data domains [2, 3], and the immediate commercial importance of recommendation [4–6] all contribute to the unceasing interest in the study of recommender systems [7]. The goal of recommendation is to use data on past user preferences to obtain personalized "recommendations" of new items (shopping items, YouTube videos, or any other content) that an individual user might appreciate. From the physics perspective, it has been interesting to realize that well-known physics processes, such as random walks and heat diffusion, on network representations [8] of the underlying data give rise to efficient recommendation methods [9–12]. Although physics is a science that aims at understanding the evolution of systems, research on network-based recommendation by physicists has entirely neglected the dimension of time, which turns out to be of high importance for traditional recommendation approaches [13–16]. While this is understandable from the historical perspective (early datasets often lacked time information), the situation is very different now. The role of time in the evolution of information networks (which serve as input data for recommendation) has been demonstrated [17–19], modeled [20–22], and turned into numerous useful applications [23–25].
The neglect of time in the research of network-based recommendation manifests itself in the evaluation of recommendation methods. This evaluation is normally done by hiding part of the input data (commonly referred to as the probe set) and using the rest of the data (referred to as the training set) as input for a recommendation method. The obtained results are finally evaluated on the basis of how well they reproduce the hidden data [26, 27]. Crucially, the probe is typically chosen at random [12, 28]; the training set therefore includes data entries from both the past and the future of the hidden ones. We show for the first time that when the latest data are hidden instead, so that the task is to reproduce strictly future data entries, the performance of network-based recommendation methods becomes dramatically worse than the performance observed on a random probe. Sophisticated network-based recommendation methods then turn out to be outperformed by trivial approaches such as ranking items by their recent popularity increase, which has been shown to be a good predictor of the future popularity increase [29]. We propose to combine a simple network-based recommendation method with dynamical features of the network growth. The resulting hybrid method is shown to consistently and significantly outperform the existing methods on various e-commerce datasets.
The problem of a random probe. – We begin by describing in detail how a probe chosen at random favors
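The difference between the two evaluation protocols described above can be illustrated with a minimal Python sketch. It is not the authors' code: the `(user, item, timestamp)` link format, the function names, the `probe_fraction` and `window` parameters, and the toy data are all assumptions made for illustration. The sketch builds a random probe and a temporal probe from the same links, checks whether the training set contains links newer than the probe, and counts recent popularity as one plausible reading of the trivial baseline mentioned in the text.

```python
import random
from collections import Counter
from typing import List, Tuple

# Assumed data format: one (user, item, timestamp) tuple per user-item link.
Link = Tuple[str, str, int]


def random_probe_split(links: List[Link], probe_fraction: float = 0.1,
                       seed: int = 0) -> Tuple[List[Link], List[Link]]:
    """Hide a uniformly random subset of links (time-unaware protocol)."""
    rng = random.Random(seed)
    shuffled = list(links)
    rng.shuffle(shuffled)
    n_probe = int(round(probe_fraction * len(shuffled)))
    return shuffled[n_probe:], shuffled[:n_probe]  # (training, probe)


def temporal_probe_split(links: List[Link], probe_fraction: float = 0.1
                         ) -> Tuple[List[Link], List[Link]]:
    """Hide the most recent links, so the probe lies strictly in the future."""
    ordered = sorted(links, key=lambda link: link[2])
    cut = len(ordered) - int(round(probe_fraction * len(ordered)))
    return ordered[:cut], ordered[cut:]  # (training, probe)


def leaks_future(training: List[Link], probe: List[Link]) -> bool:
    """True if some training link is newer than the oldest probe link."""
    if not training or not probe:
        return False
    newest_training = max(t for _, _, t in training)
    oldest_probe = min(t for _, _, t in probe)
    return newest_training > oldest_probe


def recent_popularity_increase(training: List[Link], window: int) -> Counter:
    """Count links each item gained during the last `window` time units
    of the training data (a simple non-personalized item score)."""
    t_now = max(t for _, _, t in training)
    return Counter(item for _, item, t in training if t > t_now - window)


if __name__ == "__main__":
    # Toy data: 20 links with distinct timestamps 0..19.
    links = [(f"u{i % 5}", f"i{i % 4}", i) for i in range(20)]

    train_r, probe_r = random_probe_split(links, probe_fraction=0.2)
    train_t, probe_t = temporal_probe_split(links, probe_fraction=0.2)

    print("random probe leaks future data:  ", leaks_future(train_r, probe_r))
    print("temporal probe leaks future data:", leaks_future(train_t, probe_t))
    print("recent popularity scores:", recent_popularity_increase(train_t, window=8))
```

With a random probe, the training set almost always contains links that are newer than some hidden links, which is the information leak the text refers to. With the temporal probe, and distinct timestamps as in the toy data, every training link precedes every probe link.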
Journal: CoRR
Volume: abs/1606.04666
Publication date: 2016